Building Large Language Models
Examples
TI-84 GPT4All and YouTube
How to add custom GPTs to any website in minutes.
Libraries
Run a variety of LLMs locally using Ollama
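For example, a minimal sketch of querying a locally running Ollama server over its default REST endpoint; the model name and prompt are just placeholders:

```python
# Minimal sketch: query a locally running Ollama server via its REST API.
# Assumes Ollama is installed and the model has been pulled, e.g.:
#   ollama pull llama3
import json
import urllib.request

def ask_ollama(prompt: str, model: str = "llama3") -> str:
    """Send a single non-streaming prompt to the local Ollama server."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default port
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask_ollama("In one sentence, what is a transformer?"))
```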
Collecting Data
Scripts to convert Libgen to txt (see also Explaining LLMs); a rough conversion sketch follows below.
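A hypothetical sketch of the kind of conversion such scripts do (not the linked scripts themselves), using pandoc (https://pandoc.org) as the converter; the directory names are placeholders:

```python
# Hypothetical sketch: bulk-convert downloaded EPUB files to plain text
# with pandoc, which must be installed separately. Paths are placeholders.
import subprocess
from pathlib import Path

SRC = Path("books")  # directory of downloaded .epub files (placeholder)
DST = Path("txt")    # output directory for plain-text versions
DST.mkdir(exist_ok=True)

for epub in SRC.glob("*.epub"):
    out = DST / (epub.stem + ".txt")
    # pandoc handles the EPUB -> plain-text conversion
    subprocess.run(["pandoc", str(epub), "-t", "plain", "-o", str(out)], check=True)
    print(f"converted {epub.name} -> {out.name}")
```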
Technical Details
From my question to Metaphor.systems:
The Illustrated Transformer by Jay Alammar: https://jalammar.github.io/illustrated-transformer/
Hugging Face: https://huggingface.co/
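As a companion to the Illustrated Transformer link above, a minimal NumPy sketch of the scaled dot-product attention it explains; the shapes and random inputs are toy values:

```python
# Scaled dot-product attention, as illustrated in the article:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, d_k = 8
print(attention(Q, K, V).shape)  # (4, 8)
```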
Google’s free BERT model, a comparatively small open-source language model.
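A minimal sketch of loading that BERT checkpoint through the Hugging Face transformers library (assumes `pip install transformers torch`):

```python
# Load the standard bert-base-uncased checkpoint and embed a sentence.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("LLMs are built on transformers.", return_tensors="pt")
outputs = model(**inputs)
# One contextual embedding vector per input token:
print(outputs.last_hidden_state.shape)  # e.g. torch.Size([1, 9, 768])
```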
OpenAI Cookbook, a GitHub repo of examples for using the OpenAI API.
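For instance, a minimal chat-completion call in the style of the Cookbook examples; the model name is only an example:

```python
# Requires `pip install openai` and the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model; substitute any chat model
    messages=[{"role": "user", "content": "Explain tokenization in one line."}],
)
print(response.choices[0].message.content)
```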
How to Build
Simon Willison’s 2023 summary: a good list of resources on how to build your own LLM.
The Mathematics of Training LLMs, with Quentin Anthony of EleutherAI
A deep dive into the viral Transformer Math 101 article and into high-performance distributed training for Transformer-based architectures.
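The core rule of thumb from that article is that training compute is roughly C ≈ 6ND FLOPs for N parameters and D tokens. A quick worked example; the model size and token count below are made up for illustration:

```python
# Transformer Math 101 rule of thumb: training compute C ≈ 6 * N * D FLOPs
# (roughly 2ND for the forward pass and 4ND for the backward pass).
N = 7e9        # parameters, e.g. a 7B model (illustrative)
D = 2e12       # training tokens, e.g. 2T tokens (illustrative)
C = 6 * N * D  # total training FLOPs
print(f"{C:.1e} FLOPs")  # 8.4e+22 FLOPs
```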
An Observation on Generalization: a one-hour talk by Ilya Sutskever, OpenAI’s chief scientist. He has previously argued that compression may be all you need for intelligence. In this lecture he builds on the idea of Kolmogorov complexity and on how neural networks implicitly seek simplicity in the representations they learn. He brings a clarity of thought about the generalization of these novel systems that is rarely seen in the industry.
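A toy illustration of that compression framing (my sketch, not from the talk): text with learnable structure compresses to far fewer bits per byte than incompressible noise, which is the intuition linking prediction, compression, and generalization.

```python
# Structured text compresses well; random bytes do not.
import os
import zlib

def bits_per_byte(data: bytes) -> float:
    return 8 * len(zlib.compress(data, 9)) / len(data)

structured = b"the cat sat on the mat. " * 400  # highly regular text
noise = os.urandom(len(structured))             # incompressible noise

print(f"structured: {bits_per_byte(structured):.2f} bits/byte")  # far below 8
print(f"noise:      {bits_per_byte(noise):.2f} bits/byte")       # about 8
```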
Brendan Bycroft built a well-done, step-by-step visualization of how an LLM works:
Welcome to the walkthrough of the GPT large language model! Here we’ll explore the model nano-gpt, with a mere 85,000 parameters.
Its goal is a simple one: take a sequence of six letters, C B A B B C, and sort them in alphabetical order, i.e. to “ABBBCC”.
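For reference, the behavior the 85,000-parameter model must learn is a one-liner in ordinary code; the point of the walkthrough is watching a transformer reproduce it:

```python
# The target behavior nano-gpt learns: sort a six-letter sequence
# over the alphabet {A, B, C}.
tokens = ["C", "B", "A", "B", "B", "C"]
print("".join(sorted(tokens)))  # ABBBCC
```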